Fix indefinite hang on OpenAPI schemas with cyclic model dependencies #3078

Merged
koxudaxi merged 27 commits into koxudaxi:main from kevin-paulson-mindbridge-ai:slow_schema_generation_fix
May 7, 2026

Conversation

Contributor

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai commented Apr 10, 2026

We've been generating a model from our OpenAPI schema with this tool, but since version 0.37.0 generation has not succeeded. With the latest release (0.56.0), we found that generation gets stuck in sort_data_models. I waited up to an hour and don't think it was ever going to complete. With this change, it works as expected.

I tried to follow https://datamodel-code-generator.koxudaxi.dev/development-contributing/ as best I could, but let me know if you would like me to make any changes, or whether I should create an issue or do anything else. Thanks!

Summary by CodeRabbit

  • Bug Fixes

    • Deterministic handling of circular dependencies in discriminated oneOf/allOf schemas.
    • Union type hints now omit duplicate members.
    • Safeguards added to skip invalid discriminator mappings when references are missing.
  • Tests

    • Added JSON Schema and OpenAPI regression tests for discriminated oneOf/allOf circular dependency scenarios.
  • Chores

    • Test tooling updated to include pytest-timeout.


coderabbitai Bot commented Apr 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: adae13b3-dcab-4b19-a1d7-00114776d094

📥 Commits

Reviewing files that changed from the base of the PR and between 9eaaaf9 and 269f6f2.

📒 Files selected for processing (5)
  • pyproject.toml
  • src/datamodel_code_generator/model/base.py
  • src/datamodel_code_generator/parser/base.py
  • tests/main/jsonschema/test_main_jsonschema.py
  • tests/main/openapi/test_main_openapi.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • pyproject.toml
  • src/datamodel_code_generator/model/base.py
  • tests/main/jsonschema/test_main_jsonschema.py
  • tests/main/openapi/test_main_openapi.py
  • src/datamodel_code_generator/parser/base.py

📝 Walkthrough

Walkthrough

Stabilizes data-model ordering to avoid non-terminating permutations for circular base-class graphs, adds a guard to skip discriminator path validation when base-class references are missing, deduplicates union type-hint members, updates test dependencies, and adds OpenAPI/JSON Schema regression tests plus expected outputs.
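
To make the ordering change concrete, here is a minimal sketch of the idea, not the project's actual `sort_data_models` code. The function name `order_models`, the `deps` mapping, and the rotation step that stands in for the reordering between passes are all assumptions made for illustration. The point is only that recording each unresolved ordering lets the loop notice when it is cycling through permutations and fall back to a deterministic sort.

```python
from __future__ import annotations


def order_models(deps: dict[str, set[str]]) -> list[str]:
    """Order model names so that dependencies come first where possible.

    `deps` maps a model name to the names it depends on (hypothetical
    stand-in for the generator's real unresolved-reference data).
    """
    resolved: list[str] = []
    done: set[str] = set()
    pending = list(deps)
    seen_orderings: set[tuple[str, ...]] = set()

    while pending:
        signature = tuple(pending)
        if signature in seen_orderings:
            # This exact unresolved ordering was tried before, so further
            # passes would only permute it. Emit a deterministic order.
            resolved.extend(sorted(pending))
            break
        seen_orderings.add(signature)

        deferred: list[str] = []
        for name in pending:
            if deps[name] <= done:
                resolved.append(name)
                done.add(name)
            else:
                deferred.append(name)
        # Illustrative reordering: rotate the deferred names. Without the
        # seen_orderings check, a dependency cycle would rotate forever.
        pending = deferred[1:] + deferred[:1]

    return resolved


cyclic = {"Root": set(), "C": {"A"}, "A": {"B"}, "B": {"C"}}
print(order_models(cyclic))  # ['Root', 'A', 'B', 'C']
```

Acyclic inputs never hit the fallback branch, so under these assumptions their ordering is unaffected by the guard.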

Changes

Dependency ordering, discriminator handling, model typing, tests

| Layer | File(s) | Summary |
|---|---|---|
| Type-hint shape | src/datamodel_code_generator/model/base.py | De-duplicate union member type-hint strings (order-preserving) in _build_union_type_hint and _build_base_union_type_hint. |
| Core ordering algorithm | src/datamodel_code_generator/parser/base.py | sort_data_models adds seen_orderings, computes ordering signatures from unresolved_references paths, and falls back to deterministic sort when a signature repeats to break permutation cycles. Also updates noqa to include PLR0914. |
| Discriminator validation guard | src/datamodel_code_generator/parser/base.py | __apply_discriminator_type now skips base classes without base_class.reference before calling path validation (check_paths(...)). |
| Test tooling / dependencies | pyproject.toml | Updates test dependency group: keep newer pytest/pytest-cov constraints and add pytest-timeout>=2.4. |
| Regression tests | tests/main/jsonschema/test_main_jsonschema.py, tests/main/openapi/test_main_openapi.py | Add 30s-timeout end-to-end regression tests for discriminated oneOf + allOf cycle specs using run_main_and_assert and assert_file_content. |
| Expected generated outputs | tests/data/expected/main/.../jsonschema_discriminated_oneof_allof_cycle.py, tests/data/expected/main/.../openapi_discriminated_oneof_allof_cycle.py | Add expected Pydantic modules that model the discriminated oneOf/allOf cycle and call model_rebuild() to resolve forward refs. |
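
For the type-hint row, order-preserving de-duplication of member strings can be sketched as follows. This is an illustration only: `build_union_type_hint` is a hypothetical helper rather than the project's `_build_union_type_hint`, and collapsing a single remaining member to a bare name is an assumption, not something stated in the PR.

```python
def build_union_type_hint(members: list[str]) -> str:
    """Join type-hint strings into a Union, dropping repeated members."""
    # dict.fromkeys keeps the first occurrence of each key in insertion
    # order, which gives an order-preserving de-duplication.
    unique = list(dict.fromkeys(members))
    if len(unique) == 1:
        # Assumption for this sketch: a one-member union is just the member.
        return unique[0]
    return f"Union[{', '.join(unique)}]"


print(build_union_type_hint(["A", "A", "B"]))  # Union[A, B]
print(build_union_type_hint(["B", "A", "B"]))  # Union[B, A]
```

A plain `set` would also remove duplicates, but it would not keep the member order stable between runs, which matters for reproducible generated output.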

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

breaking-change-analyzed

Poem

🐇
I hopped through loops both wide and deep,
Till seen-orderings woke me from sleep.
A stable sort, a guarded glance—
Now discriminators stand their chance.
Hooray, no more endless dance! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main objective: fixing an indefinite hang issue on OpenAPI schemas with cyclic model dependencies, which is the core problem addressed in the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 92.86% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



codspeed-hq Bot commented Apr 10, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 11 untouched benchmarks
⏩ 98 skipped benchmarks [1]


Comparing kevin-paulson-mindbridge-ai:slow_schema_generation_fix (269f6f2) with main (46cc740)


Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tests/parser/test_base.py (1)

270-303: Consider asserting a postcondition, not just “no exception.”

This catches the crash, but it would be a bit stronger if it also pinned one observable outcome after __apply_discriminator_type() runs, so a future silent no-op doesn’t still pass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/parser/test_base.py` around lines 270 - 303, The test currently only
ensures no exception is raised; add a concrete postcondition after calling
parser._Parser__apply_discriminator_type to ensure the union was not silently
modified by the discriminator logic: e.g., assert that union_inner still
contains exactly the two members referencing ref_pet and ref_other (check
identities of union_inner.data_types entries or their .reference attributes) and
that pet_model.base_classes remains unchanged — locate this in
test_apply_discriminator_type_skips_base_class_without_reference and add the
assertion(s) after the call to parser._Parser__apply_discriminator_type([root],
Imports()).
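
As a rough illustration of the suggestion above, the sketch below pairs a reference guard with a test-style postcondition. The `Reference`, `BaseClass`, and `Model` dataclasses and the `base_paths_to_check` helper are hypothetical stand-ins, not the project's real classes or the body of `__apply_discriminator_type`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Reference:
    path: str


@dataclass
class BaseClass:
    reference: Reference | None = None


@dataclass
class Model:
    name: str
    base_classes: list[BaseClass] = field(default_factory=list)


def base_paths_to_check(model: Model) -> list[str]:
    """Collect base-class reference paths, skipping bases without one."""
    paths: list[str] = []
    for base in model.base_classes:
        if base.reference is None:
            # Guard: nothing to validate for a base class with no reference.
            continue
        paths.append(base.reference.path)
    return paths


pet = Model("Pet", [BaseClass(), BaseClass(Reference("#/components/schemas/Animal"))])
before = list(pet.base_classes)

result = base_paths_to_check(pet)

# Postconditions: pin an observable outcome, not just "no exception".
assert result == ["#/components/schemas/Animal"]
assert pet.base_classes == before
```

The two final assertions are the part the nitpick asks for: a future change that turned the call into a silent no-op, or that mutated the model, would fail them.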

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: af656bb8-01d5-4471-9440-e33695a9c92c

📥 Commits

Reviewing files that changed from the base of the PR and between 05901ff and b4e675c.

📒 Files selected for processing (2)
  • src/datamodel_code_generator/parser/base.py
  • tests/parser/test_base.py


codecov Bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (46cc740) to head (269f6f2).

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #3078   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           88        88           
  Lines        18512     18526   +14     
  Branches      2138      2140    +2     
=========================================
+ Hits         18512     18526   +14     
Flag Coverage Δ
unittests 100.00% <100.00%> (ø)


Collaborator

@ilovelinux ilovelinux left a comment


This PR implements cycle detection as it is usually done in graph algorithms (BFS, DFS, etc.).

LGTM! Thank you @kevin-paulson-mindbridge-ai 🙂

Collaborator

@ilovelinux ilovelinux left a comment


Hi @kevin-paulson-mindbridge-ai, I realized it would be great to add some generation tests for schemas with cyclic references, so that we can track the generated output for this corner case.

See tests/main/{openapi,jsonschema,graphql}/test_main_*.py for examples. 🙂


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
tests/main/openapi/test_main_openapi.py (1)

4756-4761: Add an explicit timeout to this hang-regression test.

The test for cyclic component reference handling (which previously caused indefinite hangs) has no timeout protection. If this regression returns, CI will stall until a global job timeout. Since run_main_and_assert() does not enforce a per-test timeout, add one directly to this test using @pytest.mark.timeout() to fast-fail if the code hangs again.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/main/openapi/test_main_openapi.py` around lines 4756 - 4761, This test
calling run_main_and_assert(...) for "cyclic_component_refs.yaml" lacks a
per-test timeout and can hang CI; add a pytest timeout decorator (use
`@pytest.mark.timeout(seconds)`) to the test function that invokes
run_main_and_assert for cyclic_component_refs.yaml so the test fails fast if it
hangs—place the decorator immediately above the test definition and choose a
reasonable timeout (e.g., 10–30s) to prevent indefinite hangs while allowing
normal runs.
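
A minimal sketch of what such a per-test timeout looks like, assuming the pytest-timeout plugin is installed when the suite runs. `generate_models` is a hypothetical stand-in for the real generation call, not the project's `run_main_and_assert` helper.

```python
import pytest


def generate_models() -> str:
    """Hypothetical stand-in for the code generation under test."""
    return "class Model: ..."


# With pytest-timeout installed, this marker fails the test if it runs
# longer than 30 seconds instead of letting the CI job stall.
@pytest.mark.timeout(30)
def test_cyclic_schema_generation_terminates() -> None:
    assert generate_models().startswith("class")
```

Without the plugin, pytest treats `timeout` as an unknown marker and does not enforce a limit, which is why the PR also adds pytest-timeout to the test dependency group.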
tests/main/graphql/test_main_graphql.py (1)

924-929: Make the expected snapshot explicit in this regression test.

On Line 924, this relies on implicit run_main_and_assert behavior for expected output resolution. Please pass expected_file="cyclic_mutual_types.py" explicitly to keep the regression assertion unambiguous and resilient to helper changes.

Proposed patch
 def test_main_cyclic_mutual_types(output_file: Path) -> None:
     """Mutual object-type cycle (A <-> B) in GraphQL."""
     run_main_and_assert(
         input_path=GRAPHQL_DATA_PATH / "cyclic-mutual-types.graphql",
         output_path=output_file,
         input_file_type="graphql",
         assert_func=assert_file_content,
+        expected_file="cyclic_mutual_types.py",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/main/graphql/test_main_graphql.py` around lines 924 - 929, Update the
test call to run_main_and_assert to pass the explicit expected snapshot filename
so the regression is unambiguous: modify the invocation of run_main_and_assert
(the one using input_path=GRAPHQL_DATA_PATH / "cyclic-mutual-types.graphql",
output_path=output_file, input_file_type="graphql",
assert_func=assert_file_content) to include
expected_file="cyclic_mutual_types.py" as an additional argument.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a41a0a5-bdda-41a2-953a-aae47badee83

📥 Commits

Reviewing files that changed from the base of the PR and between 7eaf6ab and 7786cf9.

⛔ Files ignored due to path filters (3)
  • tests/data/graphql/cyclic-mutual-types.graphql is excluded by !tests/data/**/*.graphql and included by none
  • tests/data/jsonschema/cyclic_mutual_defs.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/openapi/cyclic_component_refs.yaml is excluded by !tests/data/**/*.yaml and included by none
📒 Files selected for processing (6)
  • tests/data/expected/main/graphql/cyclic_mutual_types.py
  • tests/data/expected/main/jsonschema/cyclic_mutual_defs.py
  • tests/data/expected/main/openapi/cyclic_component_refs.py
  • tests/main/graphql/test_main_graphql.py
  • tests/main/jsonschema/test_main_jsonschema.py
  • tests/main/openapi/test_main_openapi.py
✅ Files skipped from review due to trivial changes (3)
  • tests/data/expected/main/openapi/cyclic_component_refs.py
  • tests/data/expected/main/jsonschema/cyclic_mutual_defs.py
  • tests/data/expected/main/graphql/cyclic_mutual_types.py

Collaborator

@ilovelinux ilovelinux left a comment


I locally cherry-picked the last two commits (the ones that introduce the generation tests) onto main. I expected them to fail (or hang), but they succeeded instead.

Given remote:

$ git remote add kevin-paulson-mindbridge-ai https://github.com/kevin-paulson-mindbridge-ai/datamodel-code-generator.git
$ git fetch kevin-paulson-mindbridge-ai

To replicate, run from the main branch:

$ git cherry-pick 7786cf90867db72f716f1796ab071fefcffe4dc5
$ git cherry-pick 8d3ee3958ce8d9c595fdaa3d46a73bba2b54fa23
$ tox -e py314-parallel -- -k cyclic

This means that the cyclic model dependency added as a test was already handled by the tool.


In fact, the other tests (which implement the same cyclic model dependency corner case) also work on main.

$ git switch --detach kevin-paulson-mindbridge-ai/slow_schema_generation_fix
$ git restore src/datamodel_code_generator/parser/base.py --source main
$ tox -e py314-parallel -- -k cyclic

@kevin-paulson-mindbridge-ai could you double-check the corner case you have in your OpenAPI schema? As of now, it looks like the one you proposed was already supported, so we cannot reproduce the issue, validate the fix, or add regression tests 🙁

…ctually reproduce the issue (don't complete in a reasonable time)
@kevin-paulson-mindbridge-ai
Contributor Author

[...]

@kevin-paulson-mindbridge-ai could you double-check the corner case you have in your OpenAPI schema? As of now, it looks like the one you proposed was already supported, so we cannot reproduce the issue, validate the fix, or add regression tests 🙁

My apologies, I neglected to validate that the new tests reproduced the issue before the change. I have updated the 3 tests such that they reproduce the issue before the change (they each ran at least 30 seconds before I killed them) and after the change they pass quickly (less than a second each). They look OK to me, but I am most familiar with OpenAPI and I am very unfamiliar with graphql.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/data/expected/main/jsonschema/cyclic_mutual_defs.py`:
- Around line 14-19: The generated classes Person and Team create an impossible
mutual inheritance cycle (Person(Team, Model) and Team(Person, Model)) causing a
runtime NameError; replace this mutual inheritance with a valid hierarchy by
removing the forward base reference—e.g., have both classes inherit from Model
only or make one inherit from the other (but not both) so that Person and Team
do not form a direct cycle; update the class headers for Person and Team (the
class definitions named Person and Team) accordingly and ensure any
tests/fixtures referencing Model still import/define Model as before.

In `@tests/data/expected/main/openapi/cyclic_component_refs.py`:
- Around line 14-19: The expected fixture currently creates an unimportable
cyclic base-class relationship between Person and Team (Person(Team, Model) and
Team(Person, Model)); fix it by removing the cyclic inheritance and making both
classes inherit only from Model (e.g., class Person(Model): and class
Team(Model):) and, if a cross-reference is needed, add attributes that reference
the other type as a forward annotation (e.g., person has team: "Team" | None =
None and team has member: "Person" | None = None) so the file can be imported
successfully while preserving the relationship between Person and Team.
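
The two inline comments above can be checked with a small stdlib-only sketch. It uses dataclasses instead of the Pydantic `Model` base from the fixtures (an assumption made to keep the example dependency-free), and the field names `team` and `member` are taken from the suggestion, not from the actual fixture.

```python
from dataclasses import dataclass
from typing import Optional

# Mutual inheritance: Person names Team as a base before Team exists.
CYCLIC_SOURCE = """
class Person(Team):
    pass

class Team(Person):
    pass
"""


def try_define_cyclic_classes() -> str:
    """Execute the cyclic class definitions and report what happens."""
    try:
        exec(CYCLIC_SOURCE, {})
    except NameError as exc:
        return str(exc)
    return "no error"


print(try_define_cyclic_classes())  # name 'Team' is not defined


# Alternative: no inheritance cycle, only a forward-referenced attribute.
@dataclass
class Person:
    name: str
    team: Optional["Team"] = None


@dataclass
class Team:
    name: str
    member: Optional[Person] = None


team = Team("core", member=Person("kevin"))
team.member.team = team  # the relationship survives as plain attributes
```

Base classes are evaluated when the `class` statement runs, so a cycle there cannot be deferred, whereas a string annotation is only a name until something resolves it.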

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c31b332b-2832-4806-b2b4-639415011af1

📥 Commits

Reviewing files that changed from the base of the PR and between 8d3ee39 and bb037b0.

⛔ Files ignored due to path filters (3)
  • tests/data/graphql/cyclic-mutual-types.graphql is excluded by !tests/data/**/*.graphql and included by none
  • tests/data/jsonschema/cyclic_mutual_defs.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/openapi/cyclic_component_refs.yaml is excluded by !tests/data/**/*.yaml and included by none
📒 Files selected for processing (3)
  • tests/data/expected/main/graphql/cyclic_mutual_types.py
  • tests/data/expected/main/jsonschema/cyclic_mutual_defs.py
  • tests/data/expected/main/openapi/cyclic_component_refs.py

Comment thread tests/data/expected/main/jsonschema/cyclic_mutual_defs.py Outdated
Comment thread tests/data/expected/main/openapi/cyclic_component_refs.py Outdated
Collaborator

ilovelinux commented Apr 13, 2026

My apologies, I neglected to validate that the new tests reproduced the issue before the change.

That's not a problem, I appreciate your effort. Reviews & double-checks are useful because of that 🙂

I have updated the 3 tests such that they reproduce the issue before the change (they each ran at least 30 seconds before I killed them) and after the change they pass quickly (less than a second each). They look OK to me, but I am most familiar with OpenAPI and I am very unfamiliar with graphql.

I'll review it as soon as I get some free time 🙂

(they each ran at least 30 seconds before I killed them)

Could you add a timeout as suggested by the first nitpick comment here: #3078 (review)?

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai marked this pull request as draft April 14, 2026 01:01
@kevin-paulson-mindbridge-ai
Contributor Author

I think I've got things much closer now. I'll plan to come back to this to:

  1. Add a test for the 1 missed line
  2. Add similar tests for jsonschema and graphql

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai marked this pull request as ready for review April 27, 2026 19:05
@kevin-paulson-mindbridge-ai
Contributor Author

@coderabbitai resume


coderabbitai Bot commented Apr 27, 2026

✅ Actions performed

Reviews resumed.

@kevin-paulson-mindbridge-ai
Contributor Author

I think I've got things much closer now. I'll plan to come back to this to:

1. Add a test for the 1 missed line

2. Add similar tests for jsonschema and graphql

This is now ready for review. Note that I don't believe graphql has the same issue due to the code paths followed and the nature of graphql.

@koxudaxi koxudaxi dismissed ilovelinux’s stale review May 7, 2026 07:40

The requested changes were for an earlier version of the PR. The current regression cases reproduce the hang on main and pass on this branch, and the previous concerns appear addressed.

Owner

koxudaxi commented May 7, 2026

@kevin-paulson-mindbridge-ai
Thank you for the great PR.
I reviewed it, and the previous concerns all appear to have been addressed, so I’ll merge it.

@koxudaxi koxudaxi merged commit a699137 into koxudaxi:main May 7, 2026
43 checks passed
Contributor

github-actions Bot commented May 7, 2026

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: All three changes are bug fixes with no breaking impact: (1) The cycle detection in sort_data_models only triggers for schemas that previously caused an infinite hang — there was no working output to break. (2) The union type hint deduplication changes output from e.g. Union[A, A, B] to Union[A, B], but these are semantically identical in Python — the generated code behaves exactly the same. This only applies in the narrow forward-reference fallback path. (3) The null-reference guard in discriminator handling prevents crashes on base classes without references, with no behavior change for schemas that previously worked. No CLI/API changes, no default behavior changes, no template changes, no Python version changes.
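
The claim in point (2) is easy to confirm with the standard `typing` module. The classes `A` and `B` below are placeholders for generated models.

```python
from typing import Union, get_args


class A: ...


class B: ...


# typing drops redundant members when the Union is constructed, so the
# duplicated and de-duplicated spellings compare equal.
duplicated = Union[A, A, B]

assert duplicated == Union[A, B]
assert get_args(duplicated) == (A, B)
```

So the de-duplication only changes how the generated source reads, not what the annotation means at runtime.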


This analysis was performed by Claude Code Action

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai deleted the slow_schema_generation_fix branch May 7, 2026 11:00
Contributor

github-actions Bot commented May 7, 2026

🎉 Released in 0.57.0

This PR is now available in the latest release. See the release notes for details.
